This repository has been archived by the owner on May 25, 2022. It is now read-only.

[RDST-111] keep tensorboard directory clean during s3 sync #3

Merged 2 commits on Jan 24, 2018

Conversation

jbencook

This PR is a workaround to make sure TensorBoard picks up logs that get synced to a local directory. The problem seems to be that `aws s3 sync` creates temporary files while it's downloading them. These are lexicographically greater than the files they are updating (because they append `.<some hash>` to the end), but TensorBoard stops looking at a file once a lexicographically later one shows up (explanation here).
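For context, the approach amounts to something like the sketch below: let `aws s3 sync` write into a scratch directory, then copy each file's contents into the real TensorBoard logdir so the filenames TensorBoard is watching never change. This is only an illustration of the idea; the function name and the CLI invocation here are assumptions, not the PR's actual code.

```python
import os
import shutil
import subprocess
import tempfile


def sync_s3_to_logdir(s3_uri, logdir):
    """Sync an S3 prefix into a scratch directory, then copy file contents
    into the TensorBoard logdir so TensorBoard never sees the temporary
    `.<some hash>` files created by `aws s3 sync`."""
    scratch = tempfile.mkdtemp()
    # aws s3 sync writes its temp files into `scratch`, not `logdir`.
    subprocess.check_call(["aws", "s3", "sync", s3_uri, scratch])
    for root, dirs, files in os.walk(scratch):
        rel = os.path.relpath(root, scratch)
        dest = os.path.join(logdir, rel) if rel != "." else logdir
        os.makedirs(dest, exist_ok=True)
        for name in files:
            # Overwrite the file contents in place: the tfevents filename in
            # logdir never changes, so TensorBoard keeps reading the same file.
            shutil.copyfile(os.path.join(root, name), os.path.join(dest, name))
```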

@@ -38,6 +38,7 @@ def __init__(self, estimator, logdir=None):
        threading.Thread.__init__(self)
        self.event = threading.Event()
        self.estimator = estimator
+       self.aws_sync_dir = tempfile.mkdtemp()

My gut says this should be self._aws_sync_dir, but it's not clear to me why some fields are considered public while others aren't, so it probably doesn't matter.

"""Sync to_directory with from_directory by copying each file in
to_directory with new contents. Why do this? Because TensorBoard picks
up temp files from `aws s3 sync` and then stops reading the correct
tfevent files.

Is there a relevant TensorBoard issue we could include here? I'm just curious if that could be an easier reference for the future in understanding this and knowing if/when it's fixed, versus using git blame to track down this discussion.

This is the closest one I came across, though it's slightly different: tensorflow/tensorboard#70

Author

Yeah I still think it's probably related to tensorflow/tensorboard#349. I'll put that in the docstring.

from_directory (str): The directory with updated files.
to_directory (str): The directory to be synced.
"""
for root, dirs, files in os.walk(from_directory):

I think you'll want to check to see if to_directory exists before this loop, then create it if it doesn't. From what I can tell, running aws s3 sync src_dir dst_dir will create dst_dir for you, but it looks like we're only attempting to create subdirectories of to_directory rather than the directory itself.
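For reference, the suggested guard could look something like this (a sketch using the argument names from the docstring above; not the PR's final code):

```python
# Ensure the destination root itself exists; the walk below only creates
# subdirectories of to_directory, never the top-level directory.
if not os.path.exists(to_directory):
    os.makedirs(to_directory)

for root, dirs, files in os.walk(from_directory):
    ...
```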

@jalabort (Collaborator) left a comment

This looks like a fairly elegant, Python-only solution to a problem that must be affecting quite a lot of people. We should consider opening a PR upstream.
